{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "bbf2d495",
   "metadata": {},
   "source": [
    "This tutorial demonstrates how to add new labels to an ongoing project in real time. The example uses the `<Taxonomy>` tag, which applies to the entire document. The same approach also works in other cases, such as:\n",
    "- modifying labels for computer vision applications (bounding boxes, polygons, etc.) or for segments in text and audio, using nested `<Taxonomy>` tags;\n",
    "- modifying other tags that contain lists of labels, such as `<Choices>` and `<Labels>`.\n",
    "\n",
    "**Note:** This code uses functions from an older version of the Label Studio SDK (v0.0.34).\n",
    "Versions 1.0 and above still support this functionality, but you need to use the\n",
    "[`label_studio_sdk._legacy`](../README.md) module in your script."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "59ca9d96",
   "metadata": {},
   "source": [
    "Set the following parameters before running the rest of the notebook:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "6ed70a8e",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Label Studio instance URL - replace it if you use a self-hosted app\n",
    "LABEL_STUDIO_URL = \"https://app.heartex.com\"\n",
    "\n",
    "# !! PUT YOUR API KEY HERE - it can be retrieved from the Account & Settings page\n",
    "LABEL_STUDIO_API_KEY = '<YOUR-API-KEY>'\n",
    "\n",
    "# The ID of the project where you want to add new labels\n",
    "project_id = 32943\n",
    "\n",
    "# Value of the `name` attribute of the tag where labels should be added in each project\n",
    "tag_name = 'taxonomy'\n",
    "\n",
    "# Nested dict describing the taxonomy nodes to add:\n",
    "# a dict value means nested labels, any other value (here 1) marks a leaf label\n",
    "update_taxonomy = {\n",
    "    'New Top Level class': 1,\n",
    "    'Eukarya': {\n",
    "        'Cat': 1\n",
    "    },\n",
    "    'New Class': {\n",
    "        'Object': 1,\n",
    "        'Nested classes': {\n",
    "            'And deeper hierarchy': {\n",
    "                'Object': 1\n",
    "            },\n",
    "            'Another Object': 1\n",
    "        }\n",
    "    }\n",
    "}"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d3a499df",
   "metadata": {},
   "source": [
    "The function `add_new_labels` below parses the Label Studio XML labeling configuration and updates its nodes according to the `update_taxonomy` payload. Note that it adds new non-leaf nodes if they are missing:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "3e30115f",
   "metadata": {},
   "outputs": [],
   "source": [
    "from label_studio_sdk import Client\n",
    "import xml.etree.ElementTree as ET\n",
    "\n",
    "\n",
    "def add_nodes_recursive(parent, payload, tag_type='Choice'):\n",
    "    \"\"\"Recursively add the nodes described by 'payload' under the given 'parent' node\"\"\"\n",
    "    for key, value in payload.items():\n",
    "        # Look for an existing direct child of the parent with this value\n",
    "        node = parent.find(f\"./{tag_type}[@value='{key}']\")\n",
    "        # If the tag is not found, create it; existing labels are not duplicated\n",
    "        if node is None:\n",
    "            node = ET.SubElement(parent, tag_type, {'value': key})\n",
    "        # A dictionary value means nested tags - add them recursively\n",
    "        if isinstance(value, dict):\n",
    "            add_nodes_recursive(node, value, tag_type)\n",
    "\n",
    "\n",
    "def add_new_labels(project_id, tag_name, payload, tag_type='Taxonomy'):\n",
    "    \"\"\"Add the labels from 'payload' to the tag named 'tag_name' in the given project\"\"\"\n",
    "    # Relies on the global client `ls`, which is created in a later cell\n",
    "    project = ls.get_project(project_id)\n",
    "    print(f'Updating project \"{project.title}\" (ID={project.id})')\n",
    "    label_config = project.label_config\n",
    "    root = ET.fromstring(label_config)\n",
    "    # Locate the desired tag in XML\n",
    "    tag_to_update = root.find(f'.//{tag_type}[@name=\"{tag_name}\"]')\n",
    "    # Compare with None explicitly: an Element without children evaluates to False\n",
    "    if tag_to_update is None:\n",
    "        print(f'No <{tag_type} name=\"{tag_name}\".../> tag found.')\n",
    "        return\n",
    "    # Add nodes recursively\n",
    "    child_tag_type = 'Choice' if tag_type in ('Taxonomy', 'Choices') else 'Label'\n",
    "    add_nodes_recursive(tag_to_update, payload, child_tag_type)\n",
    "    new_label_config = ET.tostring(root, encoding='unicode')\n",
    "    project.set_params(label_config=new_label_config)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "541039da",
   "metadata": {},
   "source": [
    "Now let's see it in action. Here is the current project configuration, taken from the default template under [Natural Language Processing > Taxonomy](https://labelstud.io/templates/taxonomy.html):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "be5dc146",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "<View>\n",
      "  <Text name=\"text\" value=\"$text\"/>\n",
      "  <Taxonomy name=\"taxonomy\" toName=\"text\">\n",
      "    <Choice value=\"Archaea\" />\n",
      "    <Choice value=\"Bacteria\" />\n",
      "    <Choice value=\"Eukarya\">\n",
      "      <Choice value=\"Human\" />\n",
      "      <Choice value=\"Oppossum\" />\n",
      "      <Choice value=\"Extraterrestial\" />\n",
      "    </Choice>\n",
      "  </Taxonomy>\n",
      "</View>\n"
     ]
    }
   ],
   "source": [
    "ls = Client(url=LABEL_STUDIO_URL, api_key=LABEL_STUDIO_API_KEY)\n",
    "\n",
    "print(ls.get_project(project_id).label_config)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "cfc81f2f",
   "metadata": {},
   "source": [
    "Now let's add new labels from the specified `update_taxonomy` payload:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "2ac18a69",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Updating project \"Taxonomy AutoUpdate\" (ID=32943)\n"
     ]
    }
   ],
   "source": [
    "add_new_labels(project_id, tag_name, update_taxonomy, tag_type='Taxonomy')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "id": "21423b1b",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "<View>\n",
      "  <Text name=\"text\" value=\"$text\" />\n",
      "  <Taxonomy name=\"taxonomy\" toName=\"text\">\n",
      "    <Choice value=\"Archaea\" />\n",
      "    <Choice value=\"Bacteria\" />\n",
      "    <Choice value=\"Eukarya\">\n",
      "      <Choice value=\"Human\" />\n",
      "      <Choice value=\"Oppossum\" />\n",
      "      <Choice value=\"Extraterrestial\" />\n",
      "    <Choice value=\"Cat\" /></Choice>\n",
      "  <Choice value=\"New Top Level class\" /><Choice value=\"New Class\"><Choice value=\"Object\" /><Choice value=\"Nested classes\"><Choice value=\"And deeper hierarchy\"><Choice value=\"Object\" /></Choice><Choice value=\"Another Object\" /></Choice></Choice></Taxonomy>\n",
      "</View>\n"
     ]
    }
   ],
   "source": [
    "print(ls.get_project(project_id).label_config)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9e2158ca",
   "metadata": {},
   "source": [
    "The existing taxonomy tree has been updated with the new labels, including nested hierarchical items. Since labels were only added and none were removed, this can be done without interrupting ongoing projects.\n",
    "<div class=\"alert alert-block alert-warning\">\n",
    "<b>WARNING</b> Be cautious when adding new labels during the annotation process: it could invalidate previously created data and require you to restart the process. Only proceed if you are certain that the new labels won't cause any issues.\n",
    "</div>"
   ]
  },
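  {
   "cell_type": "markdown",
   "source": [
    "To apply the same update to several projects, iterate over them. The loop below calls `add_new_labels` for every project returned by `ls.list_projects()`; projects without a matching tag are reported and left unchanged:"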
   ],
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "id": "641c8dd06b76348c"
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "outputs": [],
   "source": [
    "for project in ls.list_projects():\n",
    "    add_new_labels(project.id, tag_name, update_taxonomy)"
   ],
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "id": "fa55edf2c553b707"
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "django-env-kernel",
   "language": "python",
   "name": "django-env-kernel"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.12"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
